Abstract

We discuss how recently discovered techniques and tools from compressed sensing can be used in tensor decompositions, with a view towards modeling signals from multiple arrays of multiple sensors. We show that with appropriate bounds on a measure of separation between radiating sources called coherence, one can always guarantee the existence and uniqueness of a best rank-r approximation of the tensor representing the signal. We also deduce a computationally feasible variant of Kruskal's uniqueness condition, where the coherence appears as a proxy for k-rank. Problems of sparsest recovery with an infinite continuous dictionary, lowest-rank tensor representation, and blind source separation are treated in a uniform fashion. The decomposition of the measurement tensor leads to simultaneous localization and extraction of radiating sources, in an entirely deterministic manner.

Résumé

Multiarray signal processing: tensor decompositions meet compressed sensing.

We describe how recently discovered compressed sensing techniques and tools can be used in tensor decompositions, illustrated by a model of signals coming from several multisensor arrays. We show that by imposing appropriate bounds on a certain measure of separation between radiating sources (called coherence in the jargon of compressed sensing), one can always guarantee the existence and uniqueness of a best rank-r approximation of the tensor representing the signal. We also deduce a computable variant of Kruskal's uniqueness condition, in which this coherence appears as a measure of the k-rank. The problems of sparse recovery with an infinite continuous dictionary, of lowest-rank tensor representation, and of blind source separation are thus treated in one and the same way. The decomposition of the measurement tensor leads to the simultaneous localization and extraction of the radiating sources, in an entirely deterministic manner.

Keywords: Blind source separation, blind channel identification, tensors, tensor rank, polyadic tensor decompositions, best rank-r approximations, sparse representations, spark, k-rank, coherence, multiarrays, multisensors

Mots-clés: Blind source separation, blind channel identification, tensors, tensor rank, polyadic tensor decompositions, best rank-r approximation, sparse representations, spark, k-rank, coherence, multiple antennas, multisensors



Abridged French version

We explain how tensor decompositions and approximation models arise naturally in multisensor signals, and see how the study of these models can be enriched by contributions from compressed sensing. The term compressed sensing is to be taken in a broad sense, encompassing not only the ideas covered by [1] [4] [7] [14] [15] [18], but also the work on rank minimization and matrix completion [3] [5] [17] [19] [31] [38].

We explore two themes in particular: (1) the use of redundant dictionaries with bounds on the scalar products between their elements; (2) the use of coherence or spark to prove uniqueness. In particular, we will see how these ideas can be extended to tensors and applied to their decompositions and approximations. If we describe the works [1] [4] [7] [14] [15] [18] as "compressed sensing of linear forms" (vector variables) and [3] [5] [17] [19] [31] [38] as "compressed sensing of bilinear forms" (matrix variables), then this article is about compressed sensing of multilinear forms (tensor variables).

Tensor approximations harbor difficulties due to their ill-posedness [9] [12], and most problems in multilinear algebra are NP-hard to compute [20] [22]. Moreover, it is often difficult or even impossible to answer certain fundamental questions about tensors within the framework of algebraic geometry, which is nevertheless the usual framework for formulating these questions (cf. Section 4). We will see that some of these problems could become more tractable if they are moved from algebraic geometry to harmonic analysis. More precisely, we will see how concepts gleaned from compressed sensing can be used to alleviate certain difficulties.

Finally, we show that if the sources are sufficiently separated, then it is possible to localize and extract them in a completely deterministic manner. By "sufficiently separated", we mean that certain scalar products are below a threshold, which decreases with the number of sources present. In the jargon of compressed sensing, "coherence" denotes the largest of these scalar products. By imposing appropriate bounds on this coherence, one can always guarantee the existence and uniqueness of a best rank-r approximation of a tensor, and consequently the identifiability of a propagation channel on the one hand, and the estimation of the source signals on the other.

1. Introduction 

We discuss how tensor decomposition and approximation models arise naturally in multiarray multisensor signal processing and see how the studies of such models are enriched by mathematical innovations coming from compressed sensing. We interpret the term compressed sensing in a loose and broad sense, encompassing not only the ideas covered in [1] [4] [7] [14] [15] [18] but also the line of work on rank minimization and matrix completion in [3] [5] [17] [19] [31] [38]. We explore two themes in particular: (1) the use of overcomplete dictionaries with bounds on coherence; (2) the use of spark or coherence to obtain uniqueness results. In particular we will see how these ideas may be extended to tensors and applied to their decompositions and approximations. If we view [1] [4] [7] [14] [15] [18] as 'compressed sensing of linear forms' (vector variables) and [3] [5] [17] [19] [31] [38] as 'compressed sensing of bilinear forms' (matrix variables), then this article is about 'compressed sensing of multilinear forms' (tensor variables), where these vectors, matrices, or tensors are signals measured by sensors or arrays of sensors.

Tensor approximations are fraught with ill-posedness difficulties [9] [12] and computations of most multilinear algebraic problems are NP-hard [20] [22]. Furthermore, even some of the most basic questions about tensors are often difficult or even impossible to answer within the framework of algebraic geometry, the usual context for formulating such questions (cf. Section 4). We will see that some of these problems with tensors could become more tractable when we move from algebraic geometry to slightly different problems within the framework of harmonic analysis. More specifically we will show how wisdom gleaned from compressed sensing could be used to alleviate some of these issues.

This article is intended to be a short communication. Any result whose proof requires more than a few lines of arguments is not mentioned at all but deferred to our full paper [11]. Relations with other aspects of compressed sensing beyond the two themes mentioned above, most notably exact recoverability results under the restricted isometry property [2] or coherence assumptions [35], are also deferred to [11]. While the discussions in this article are limited to order-3 tensors, it is entirely straightforward to extend them to tensors of any higher order.

2. Multisensor signal processing 

Tensors are well-known to arise in signal processing as higher order cumulants in independent component analysis and have been used successfully in blind source separation [10]. The signal processing application considered here is of a different nature but also has a natural tensor decomposition model. Unlike the amazing single-pixel camera [16] that is celebrated in compressed sensing, this application comes from the opposite end and involves multiple arrays of multiple sensors [30].

Consider an array of $l$ sensors, each located in space at a position defined by a vector $b_i \in \mathbb{R}^3$, $i = 1, \dots, l$. Assume this array is impinged by $r$ narrow-band waves transmitted by independent radiating sources through a linear stationary medium. Denote by $\sigma_p(t_k)$ the complex envelope of the $p$th source, $1 \le p \le r$, where $t_k$ denotes a point in time and $k = 1, \dots, n$. If the location of source $p$ is characterized by a parameter vector $\theta_p$, the signal received by sensor $i$ at time $t_k$ can be written as

$$s_i(k) = \sum_{p=1}^{r} \sigma_p(t_k)\,\varepsilon_i(\theta_p) \tag{1}$$

where $\varepsilon_i$ characterizes the response of sensor $i$ to external excitations.

Such multisensor arrays occur in a variety of applications including acoustics, neuroimaging, and telecommunications. The sensors could be antennas, EEG electrodes, microphones, radio telescopes, etc., capturing signals in the form of images, radio waves, sounds, ultrasounds, etc., emanating from sources that could be cell phones, distant galaxies, the human brain, party conversations, etc.

Example 2.1. For instance, if one considers the transmission of narrowband electromagnetic waves over air, $\varepsilon_i(\theta_p)$ can be assimilated to a pure complex exponential (provided the differences between time delays of arrival are much smaller than the inverse of the bandwidth):

$$\varepsilon_i(\theta_p) \approx \exp(\imath \psi_{i,p}), \qquad \psi_{i,p} := \frac{\omega}{c}\Bigl(b_i^{\mathsf T} d_p - \frac{1}{2R_p}\,\|b_i \wedge d_p\|_2^2\Bigr) \tag{2}$$

where the $p$th source location is defined by its direction $d_p \in \mathbb{R}^3$ and distance $R_p$ from an arbitrarily chosen origin $O$, $\omega$ denotes the central pulsation, $c$ the wave celerity, $\imath^2 = -1$, and $\wedge$ the vector wedge product. More generally, one may consider $\psi_{i,p}$ to be a sum of functions whose variables separate, i.e. $\psi_{i,p} = f(i)^{\mathsf T} g(p)$, where $f(i)$ and $g(p)$ are vectors of the same dimension. Note that if sources are in the far field ($R_p \gg 1$), then the last term in the expression of $\psi_{i,p}$ in (2) may be neglected.

2.1. Structured multisensor arrays 

We are interested in sensor arrays enjoying an invariance property. We assume that there are $m$ arrays, each having the same number $l$ of sensors. They do not need to be disjoint, that is, two different arrays may share one or more sensors.

From (1), the signal received by the $j$th array, $j = 1, \dots, m$, takes the form

$$s_{i,j}(k) = \sum_{p=1}^{r} \sigma_p(t_k)\,\varepsilon_{i,j}(\theta_p). \tag{3}$$

The invariance property$^{1}$ that we are interested in can be expressed as

$$\varepsilon_{i,j}(\theta_p) = \varepsilon_{i,1}(\theta_p)\,\varphi(j,p). \tag{4}$$

In other words, variables $i$ and $j$ decouple.

This property is encountered in the case of arrays that can be obtained from each other by a translation (see Figure 1). Assume sources are in the far field. Denote by $\Delta_j$ the vector that allows deduction of the locations of sensors in the $j$th array from those of the 1st array. Under these hypotheses, we have for the first array, $\psi_{i,p,1} = \frac{\omega}{c}\,b_i^{\mathsf T} d_p$. By a translation of $\Delta_j$ we obtain the phase response of the $j$th array as:

$$\psi_{i,p,j} = \frac{\omega}{c}\bigl(b_i^{\mathsf T} d_p + \Delta_j^{\mathsf T} d_p\bigr).$$

Observe that indices $i$ and $j$ decouple upon exponentiation and that we have $\varphi(j,p) = \exp\bigl(\imath\,\frac{\omega}{c}\,\Delta_j^{\mathsf T} d_p\bigr)$. Now plug the invariance expression (4) into (3) to obtain the observation model:

$$s_{i,j}(k) = \sum_{p=1}^{r} \varepsilon_{i,1}(\theta_p)\,\varphi(j,p)\,\sigma_p(t_k), \qquad i = 1, \dots, l;\; j = 1, \dots, m;\; k = 1, \dots, n.$$
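The decoupling of the sensor index and the subarray index under translation can be checked numerically. The sketch below is only an illustration under stated assumptions: far-field sources, random sensor positions and translations, and an arbitrary value for the ratio of pulsation to celerity (`omega_over_c`). All variable names are hypothetical and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
l, m, r = 4, 3, 2                   # sensors per subarray, subarrays, sources
omega_over_c = 2 * np.pi            # assumed value of omega / c

b = rng.standard_normal((l, 3))     # sensor positions b_i of the first subarray
delta = rng.standard_normal((m, 3)) # translations Delta_j
delta[0] = 0.0                      # the first subarray is the reference
d = rng.standard_normal((r, 3))
d /= np.linalg.norm(d, axis=1, keepdims=True)   # unit directions d_p

# Far-field response: eps[i, j, p] = exp(i * (omega/c) * (b_i + Delta_j)^T d_p)
positions = b[:, None, :] + delta[None, :, :]
phase = omega_over_c * np.einsum("ijx,px->ijp", positions, d)
eps = np.exp(1j * phase)

# Invariance property: eps_{i,j} = eps_{i,1} * phi(j, p)
eps_first = eps[:, 0, :]
phi = np.exp(1j * omega_over_c * (delta @ d.T))
factored = eps_first[:, None, :] * phi[None, :, :]
```

Comparing `eps` with `factored` (for instance with `np.allclose`) confirms that the response of every translated subarray is the response of the first subarray multiplied by a factor depending only on the subarray and the source.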



1. So called as the property follows from translation invariance: angles of arrival remain the same. The term was probably first used in [33].

Figure 1: From the same array of sensors, several subarrays can be defined that differ from each other by translation: (a) two (overlapping) subarrays of 4 sensors, (b) two subarrays of 3 sensors, (c) three subarrays of 2 sensors.



This simple multilinear model is the one that we shall discuss in this article. Note that the left hand side is measured, while the quantities on the right hand side are to be estimated. If we rewrite $a_{ijk} = s_{i,j}(k)$, $u_{ip} = \hat\varepsilon_{i,1}(\theta_p)$, $v_{jp} = \hat\varphi(j,p)$, $w_{kp} = \hat\sigma_p(t_k)$ (where the 'hat' indicates that the respective quantities are suitably normalized) and introduce a scalar $\lambda_p$ to capture the collective magnitudes, we get the tensor decomposition model

$$a_{ijk} = \sum_{p=1}^{r} \lambda_p\, u_{ip} v_{jp} w_{kp}, \qquad i = 1, \dots, l;\; j = 1, \dots, m;\; k = 1, \dots, n,$$

with $\|u_p\| = \|v_p\| = \|w_p\| = 1$. In the presence of noise, we often seek a tensor approximation model with respect to some measure of nearness, say, a sum-of-squares loss that is common when the noise is assumed white and Gaussian:

$$\min_{\lambda_p, u_p, v_p, w_p} \sum_{i,j,k=1}^{l,m,n} \Bigl| a_{ijk} - \sum_{p=1}^{r} \lambda_p\, u_{ip} v_{jp} w_{kp} \Bigr|^2.$$

Our model has the following physical interpretation: if $a_{ijk}$ is the array of measurements recorded from sensor $i$ of subarray $j$ at time $k$, then it is ideally written as a sum of $r$ individual source contributions $\sum_{p=1}^{r} \lambda_p\, u_{ip} v_{jp} w_{kp}$. Here, $u_{ip}$ represent the transfer functions among sensors of the same subarray, $v_{jp}$ the transfer between subarrays, and $w_{kp}$ the discrete-time source signals. All these quantities can be identified. In other words, the exact way one subarray can be deduced from the others does not need to be known. Only the existence of this geometrical invariance is required.
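As an illustration of the decomposition model and the sum-of-squares loss, here is a minimal NumPy sketch that builds a synthetic measurement array from unit-norm factors and evaluates the loss at the true factors. The factor matrices are random stand-ins rather than physical sensor responses, and the helper names `unit_columns`, `cp_tensor` and `random_complex` are hypothetical.

```python
import numpy as np

def unit_columns(M):
    """Scale each column of M to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=0, keepdims=True)

def cp_tensor(lam, U, V, W):
    """Return the array a_ijk = sum_p lam_p * u_ip * v_jp * w_kp."""
    return np.einsum("p,ip,jp,kp->ijk", lam, U, V, W)

rng = np.random.default_rng(0)
l, m, n, r = 4, 3, 5, 2   # sensors per subarray, subarrays, time samples, sources

def random_complex(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

U = unit_columns(random_complex(l, r))   # stand-in for normalized sensor responses
V = unit_columns(random_complex(m, r))   # stand-in for transfers between subarrays
W = unit_columns(random_complex(n, r))   # stand-in for normalized source signals
lam = np.array([2.0, 0.5], dtype=complex)

A = cp_tensor(lam, U, V, W)              # noiseless measurements
noise = 0.01 * random_complex(l, m, n)   # white Gaussian perturbation
loss = np.sum(np.abs((A + noise) - cp_tensor(lam, U, V, W)) ** 2)
```

At the true factors the loss reduces to the energy of the noise, which is the quantity a fitting procedure would try to reach from the noisy measurements alone.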



3. Tensor rank 

Let $V_1, \dots, V_k$ be vector spaces over a field, say, $\mathbb{C}$. An element of the tensor product $V_1 \otimes \cdots \otimes V_k$ is called an order-$k$ tensor or $k$-tensor for short. Scalars, vectors, and matrices may be regarded as tensors of order 0, 1, and 2 respectively. For the purpose of this article and for notational simplicity, we will limit our discussions to 3-tensors. Denote by $l, m, n$ the dimensions of $V_1$, $V_2$, and $V_3$, respectively. Up to a choice of bases on $V_1, V_2, V_3$, a 3-tensor in $V_1 \otimes V_2 \otimes V_3$ may be represented by an $l \times m \times n$ array of elements of $\mathbb{C}$,

$$A = (a_{ijk})_{i,j,k=1}^{l,m,n} \in \mathbb{C}^{l \times m \times n}.$$

These are sometimes called hypermatrices$^{2}$ and come equipped with certain algebraic operations inherited from the algebraic structure of $V_1 \otimes V_2 \otimes V_3$. The one that interests us most is the decomposition of $A = (a_{ijk}) \in \mathbb{C}^{l \times m \times n}$ as

$$A = \sum_{p=1}^{r} \lambda_p\, u_p \otimes v_p \otimes w_p, \qquad a_{ijk} = \sum_{p=1}^{r} \lambda_p\, u_{ip} v_{jp} w_{kp}, \tag{5}$$

with $\lambda_p \in \mathbb{C}$, $u_p \in \mathbb{C}^l$, $v_p \in \mathbb{C}^m$, $w_p \in \mathbb{C}^n$. For $u = [u_1, \dots, u_l]^{\mathsf T}$, $v = [v_1, \dots, v_m]^{\mathsf T}$, $w = [w_1, \dots, w_n]^{\mathsf T}$, we write $u \otimes v \otimes w := (u_i v_j w_k)_{i,j,k=1}^{l,m,n} \in \mathbb{C}^{l \times m \times n}$. This generalizes $u \otimes v = u v^{\mathsf T}$ in the case of matrices.

A different choice of bases on $V_1, \dots, V_k$ would lead to a different hypermatrix representation of elements in $V_1 \otimes \cdots \otimes V_k$. For the more pedantic readers, it is understood that what we call a tensor in this article really means a hypermatrix. The decomposition of a tensor into a linear combination of rank-1 tensors was first studied in [21].



2. The subscripts and superscripts will be dropped when the range of $i, j, k$ is obvious or unimportant.

Definition 3.1. A tensor that can be expressed as an outer product of vectors is called decomposable (or rank-one if it is also nonzero). More generally, the rank of a tensor $A = (a_{ijk})_{i,j,k=1}^{l,m,n} \in \mathbb{C}^{l \times m \times n}$, denoted $\operatorname{rank}(A)$, is defined as the minimum $r$ for which $A$ may be expressed as a sum of $r$ rank-1 tensors,

$$\operatorname{rank}(A) := \min\Bigl\{ r \;\Bigm|\; A = \sum_{p=1}^{r} \lambda_p\, u_p \otimes v_p \otimes w_p \Bigr\}. \tag{6}$$

We will call a decomposition of the form (5) a rank-revealing decomposition when $r = \operatorname{rank}(A)$. The definition of rank in (6) agrees with matrix rank when applied to an order-2 tensor.

$\mathbb{C}^{l \times m \times n}$ is a Hilbert space of dimension $lmn$, equipped with the Frobenius (or Hilbert-Schmidt) norm, and its associated scalar product:

$$\|A\|_F^2 = \sum_{i,j,k=1}^{l,m,n} |a_{ijk}|^2, \qquad \langle A, B \rangle_F = \sum_{i,j,k=1}^{l,m,n} a_{ijk}\, \overline{b}_{ijk}.$$

One may also define tensor norms that are the $\ell^p$ equivalent of the Frobenius norm [27] and tensor norms that are analogous to operator norms of matrices [22].



4. Existence 

The problem that we consider here is closely related to the best $r$-term approximation problem in nonlinear approximations, with one notable difference: our dictionary is a continuous manifold, as opposed to a discrete set, of atoms. We approximate a general signal $v \in \mathbb{H}$ with an $r$-term approximant over some dictionary of atoms $D$, i.e. $D \subseteq \mathbb{H}$ and $\operatorname{span}(D) = \mathbb{H}$. We refer the reader to [7] for a discussion of the connection between compressed sensing and nonlinear approximations. We denote the set of $r$-term approximants by $\Sigma_r(D) := \{\lambda_1 v_1 + \cdots + \lambda_r v_r \in \mathbb{H} \mid v_1, \dots, v_r \in D,\ \lambda_1, \dots, \lambda_r \in \mathbb{C}\}$. Usually $D$ is finite or countable but we have a continuum of atoms comprising all decomposable tensors. The set of decomposable tensors

$$\operatorname{Seg}(l,m,n) := \{A \in \mathbb{C}^{l \times m \times n} \mid \operatorname{rank}(A) \le 1\} = \{x \otimes y \otimes z \mid x \in \mathbb{C}^l,\ y \in \mathbb{C}^m,\ z \in \mathbb{C}^n\}$$

is known in geometry as the Segre variety. It has the structure of both a smooth manifold and an algebraic variety, with dimension $l + m + n$ (whereas finite or countable dictionaries are 0-dimensional). The set of $r$-term approximants in our case is the $r$th secant quasiprojective variety of the Segre variety, $\Sigma_r(\operatorname{Seg}(l,m,n)) = \{A \in \mathbb{C}^{l \times m \times n} \mid \operatorname{rank}(A) \le r\}$. Such a set may be neither closed nor irreducible. In order to study this set using standard tools of algebraic geometry [6] [26] [37], one often considers a simpler variant called the $r$th secant variety of the Segre variety, the (Zariski) closure of $\Sigma_r(\operatorname{Seg}(l,m,n))$. Even with this simplification, many basic questions remain challenging and open: for example, it is not known what the value of the generic rank$^{3}$ is for general values of $l, m, n$ [8]; nor are the polynomial equations$^{4}$ defining the $r$th secant variety known in general [26].

The seemingly innocent remark in the preceding paragraph that for $r > 1$, the set $\{A \in \mathbb{C}^{l \times m \times n} \mid \operatorname{rank}(A) \le r\}$ is in general not a closed set has implications for the model that we proposed. Another way to view this is that tensor rank for tensors of order 3 or higher is not a lower semicontinuous function [12]. Note that tensor rank for order-2 tensors (i.e. matrix rank) is lower semicontinuous: if $A$ is a matrix and $\operatorname{rank}(A) = r$, then $\operatorname{rank}(B) \ge r$ for all matrices $B$ in a sufficiently small neighborhood of $A$. As a consequence, the best rank-$r$ approximation problem for tensors,

$$\operatorname*{argmin}_{\|u_p\|_2 = \|v_p\|_2 = \|w_p\|_2 = 1} \bigl\| A - \lambda_1 u_1 \otimes v_1 \otimes w_1 - \cdots - \lambda_r u_r \otimes v_r \otimes w_r \bigr\|_F, \tag{7}$$

unlike that for matrices, does not in general have a solution. The following is a simple example taken from [12].



3. Roughly speaking, this is the value of $r$ such that a randomly generated tensor will have rank $r$. For $m \times n$ matrices, the generic rank is $\min\{m, n\}$ but $l \times m \times n$ tensors in general have generic rank $> \min\{l, m, n\}$.

4. For matrices, these equations are simply given by the vanishing of the $k \times k$ minors for all $k > r$.

Example 4.1. Let $u_i, v_i \in \mathbb{C}^m$, $i = 1, 2, 3$. Let $A := u_1 \otimes u_2 \otimes v_3 + u_1 \otimes v_2 \otimes u_3 + v_1 \otimes u_2 \otimes u_3$ and for $n \in \mathbb{N}$, let

$$A_n := n \Bigl(u_1 + \frac{1}{n} v_1\Bigr) \otimes \Bigl(u_2 + \frac{1}{n} v_2\Bigr) \otimes \Bigl(u_3 + \frac{1}{n} v_3\Bigr) - n\, u_1 \otimes u_2 \otimes u_3.$$

One may show that $\operatorname{rank}(A) = 3$ iff $u_i, v_i$ are linearly independent, $i = 1, 2, 3$. Since it is clear that $\operatorname{rank}(A_n) \le 2$ by construction and $\lim_{n \to \infty} A_n = A$, the rank-3 tensor $A$ has no best rank-2 approximation. Such a tensor is said to have border rank 2.
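The convergence in Example 4.1 is easy to observe numerically. The following sketch uses random real vectors in dimension 3 (an assumption made for simplicity; the example is stated over the complex numbers) and hypothetical helper names. It builds the tensor and the sequence of tensors of rank at most 2, then records the Frobenius distance between them for growing values of the index.

```python
import numpy as np

def outer3(x, y, z):
    """Decomposable tensor x (x) y (x) z as a 3-way array."""
    return np.einsum("i,j,k->ijk", x, y, z)

rng = np.random.default_rng(1)
u = [rng.standard_normal(3) for _ in range(3)]
v = [rng.standard_normal(3) for _ in range(3)]

# A = u1 (x) u2 (x) v3 + u1 (x) v2 (x) u3 + v1 (x) u2 (x) u3
A = outer3(u[0], u[1], v[2]) + outer3(u[0], v[1], u[2]) + outer3(v[0], u[1], u[2])

def A_n(n):
    """Difference of two decomposable tensors, hence of rank at most 2."""
    first = n * outer3(u[0] + v[0] / n, u[1] + v[1] / n, u[2] + v[2] / n)
    return first - n * outer3(u[0], u[1], u[2])

# Frobenius distances ||A_n - A||_F for increasing n
errors = [np.linalg.norm(A_n(n) - A) for n in (10, 100, 1000)]
```

Expanding the product shows that the difference between the two tensors is a term of order $1/n$ plus a term of order $1/n^2$, so the recorded distances shrink roughly tenfold each time the index grows tenfold, even though every approximant has rank at most 2.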

This phenomenon where a tensor fails to have a best rank-$r$ approximation is much more widespread than one might imagine, occurring over a wide range of dimensions, orders, and ranks; it happens regardless of the choice of norm (or even Bregman divergence) used. These counterexamples occur with positive probability and in some cases with certainty (in $\mathbb{R}^{2 \times 2 \times 2}$ and $\mathbb{C}^{2 \times 2 \times 2}$, no tensor of rank 3 has a best rank-2 approximation). We refer the reader to [12] for further details.

Why not consider approximation by tensors in the closure of the set of all rank-$r$ tensors, i.e. the $r$th secant variety, instead? Indeed this was the idea behind the weak solutions suggested in [12]. The trouble with this approach is that it is not known how one could parameterize the $r$th secant variety in general: while we know that all elements of the $r$th secant quasiprojective variety $\Sigma_r(\operatorname{Seg}(l,m,n))$ may be parameterized as $\lambda_1 u_1 \otimes v_1 \otimes w_1 + \cdots + \lambda_r u_r \otimes v_r \otimes w_r$, it is not known how one could parameterize the limits of these, i.e. the additional elements that occur in the closure of $\Sigma_r(\operatorname{Seg}(l,m,n))$, when $r > \min\{l, m, n\}$. More specifically, if $r \le \min\{l, m, n\}$, Terracini's Lemma [37] provides a way to do this since generically a rank-$r$ tensor has the form $\lambda_1 u_1 \otimes v_1 \otimes w_1 + \cdots + \lambda_r u_r \otimes v_r \otimes w_r$ where $\{u_1, \dots, u_r\}$, $\{v_1, \dots, v_r\}$, $\{w_1, \dots, w_r\}$ are linearly independent; but when $r > \min\{l, m, n\}$, this generic linear independence does not hold and there are no known ways to parameterize a rank-$r$ tensor in this case.

We propose that a better way would be to introduce natural a priori conditions that prevent the phenomenon in Example 4.1 from occurring. An example of such conditions is nonnegativity restrictions on $\lambda_p, u_p, v_p, w_p$, examined in our earlier work [27]. Here we will impose much weaker and more natural restrictions motivated by the notion of coherence. Recall that a real valued function $f$ with an unbounded domain $\operatorname{dom}(f)$ and $\lim_{x \in \operatorname{dom}(f),\ \|x\| \to +\infty} f(x) = +\infty$ is called coercive (or 0-coercive) [23]. A nice feature of such functions is that the existence of a global minimizer is guaranteed. The objective function in (7) is not coercive in general but we will show here that a mild condition on coherence, a notion that frequently appears in recent work on compressed sensing, allows us to obtain a coercive function and therefore circumvent the non-existence difficulty. In the context of our application in Section 2, coherence quantifies the minimal angular separation in space or the minimal cross correlation in time of the radiating sources.

Definition 4.2. Let $\mathbb{H}$ be a Hilbert space and $v_1, \dots, v_r \in \mathbb{H}$ be a finite collection of unit vectors, i.e. $\|v_p\|_{\mathbb{H}} = 1$. The coherence of the collection $V = \{v_1, \dots, v_r\}$ is defined as $\mu(V) := \max_{p \ne q} |\langle v_p, v_q \rangle|$.

This notion has been introduced in slightly different forms and names: mutual incoherence of two dictionaries [15], mutual coherence of two dictionaries [4], the coherence of a subspace projection [5], etc. The version here follows that of [18]. We will be interested in the case when $\mathbb{H}$ is finite dimensional (in particular $\mathbb{H} = \mathbb{C}^{l \times m \times n}$ or $\mathbb{C}^m$). When $\mathbb{H} = \mathbb{C}^m$, we often regard $V$ as an $m \times r$ matrix whose column vectors are $v_1, \dots, v_r$. Clearly $0 \le \mu(V) \le 1$, $\mu(V) = 0$ iff $v_1, \dots, v_r$ are orthonormal, and $\mu(V) = 1$ iff $V$ contains at least a pair of collinear vectors.
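Computing the coherence of a finite collection amounts to scanning the off-diagonal entries of a Gram matrix. A minimal sketch, assuming the vectors are stored as the columns of a NumPy array with at least two columns (the function name `coherence` is hypothetical):

```python
import numpy as np

def coherence(V):
    """Largest modulus of an inner product between two distinct columns of V.

    Columns are normalized first, so V need not have unit-norm columns.
    """
    V = np.asarray(V, dtype=complex)
    V = V / np.linalg.norm(V, axis=0, keepdims=True)
    G = np.abs(V.conj().T @ V)   # moduli of the Gram matrix entries
    np.fill_diagonal(G, 0.0)     # discard the diagonal (inner products of a column with itself)
    return G.max()

orthonormal = np.eye(3)                          # coherence 0
collinear = np.array([[1.0, 2.0], [0.0, 0.0]])   # two collinear columns, coherence 1
tilted = np.array([[1.0, 1.0], [0.0, 1.0]])      # columns at 45 degrees
```

The three small examples cover the extreme cases noted above (orthonormal and collinear collections) and one intermediate case, where the coherence equals the cosine of the angle between the two columns.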

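While a solution to the best rank-$r$ approximation problem (7) may not exist, the following shows that a solution to the bounded coherence best rank-$r$ approximation problem (8) always exists.

Theorem 4.3. Let $A \in \mathbb{C}^{l \times m \times n}$ and let $\mathcal{U} = \{U \in \mathbb{C}^{l \times r} \mid \mu(U) \le \mu_1\}$, $\mathcal{V} = \{V \in \mathbb{C}^{m \times r} \mid \mu(V) \le \mu_2\}$, $\mathcal{W} = \{W \in \mathbb{C}^{n \times r} \mid \mu(W) \le \mu_3\}$ be families of dictionaries of unit vectors of coherence not more than $\mu_1, \mu_2, \mu_3$ respectively. If

$$\mu_1 \mu_2 \mu_3 < \frac{1}{r},$$

then the infimum $\eta$ defined as

$$\eta = \inf\Bigl\{ \Bigl\| A - \sum_{p=1}^{r} \lambda_p\, u_p \otimes v_p \otimes w_p \Bigr\| \;\Bigm|\; \lambda \in \mathbb{C}^r,\ U \in \mathcal{U},\ V \in \mathcal{V},\ W \in \mathcal{W} \Bigr\} \tag{8}$$

is attained. Here $\|\cdot\|$ denotes any norm on $\mathbb{C}^{l \times m \times n}$.

Proof. Since all norms are equivalent on a finite dimensional space, we may assume that $\|\cdot\| = \|\cdot\|_F$, the Frobenius norm. Let the objective function $f : \mathbb{C}^r \times \mathcal{U} \times \mathcal{V} \times \mathcal{W} \to [0, \infty)$ be

$$f(\lambda, U, V, W) := \Bigl\| A - \sum_{p=1}^{r} \lambda_p\, u_p \otimes v_p \otimes w_p \Bigr\|_F. \tag{9}$$

Let $\mathcal{E} = \mathbb{C}^r \times \mathcal{U} \times \mathcal{V} \times \mathcal{W}$. Note that $\mathcal{E}$ as a subset of $\mathbb{C}^{r(1+l+m+n)}$ is noncompact (closed but unbounded). We write $T = (\lambda, U, V, W)$ and let the infimum in question be $\eta := \inf\{f(T) \mid T \in \mathcal{E}\}$. We will show that the sublevel set of $f$ restricted to $\mathcal{E}$, defined as $\mathcal{E}_\alpha = \{T \in \mathcal{E} \mid f(T) \le \alpha\}$, is compact for all $\alpha > \eta$ and thus the infimum of $f$ on $\mathcal{E}$ is attained. The set $\mathcal{E}_\alpha = \mathcal{E} \cap f^{-1}(-\infty, \alpha]$ is closed since $\mathcal{E}$ is closed and $f$ is continuous (by the continuity of norm). It remains to show that $\mathcal{E}_\alpha$ is bounded. Suppose the contrary. Then there exists a sequence $(T_k)_{k=1}^{\infty} \subset \mathcal{E}$ with $\|T_k\|_2 \to \infty$ but $f(T_k) \le \alpha$ for all $k$. Since the columns of $U$, $V$, $W$ are unit vectors and hence bounded, $\|T_k\|_2 \to \infty$ implies that $\|\lambda^{(k)}\|_2 \to \infty$. Note that

$$f(T) \ge \Bigl|\, \|A\|_F - \Bigl\| \sum_{p=1}^{r} \lambda_p\, u_p \otimes v_p \otimes w_p \Bigr\|_F \Bigr|.$$

We have

$$\Bigl\| \sum_{p=1}^{r} \lambda_p\, u_p \otimes v_p \otimes w_p \Bigr\|_F^2 = \sum_{p,q=1}^{r} \lambda_p \overline{\lambda}_q \langle u_p, u_q \rangle \langle v_p, v_q \rangle \langle w_p, w_q \rangle \ge \sum_{p=1}^{r} |\lambda_p|^2 - \mu_1 \mu_2 \mu_3 \sum_{p \ne q} |\lambda_p \lambda_q| \ge \|\lambda\|_2^2 - \mu_1 \mu_2 \mu_3 \|\lambda\|_1^2 \ge (1 - r \mu_1 \mu_2 \mu_3) \|\lambda\|_2^2.$$

The last inequality follows from $\|\lambda\|_1 \le \sqrt{r}\, \|\lambda\|_2$ for any $\lambda \in \mathbb{C}^r$. By our assumption $1 - r \mu_1 \mu_2 \mu_3 > 0$ and so as $\|\lambda^{(k)}\|_2 \to \infty$, $f(T_k) \to \infty$, which contradicts the assumption that $f(T_k) \le \alpha$ for all $k$. □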
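The key estimate in the proof, a lower bound on the squared Frobenius norm of a sum of decomposable tensors in terms of the coefficients and the three coherences, can be sanity-checked numerically. The sketch below uses random complex factors with unit-norm columns and their actual coherences in place of the bounds; the dimensions and helper names are arbitrary choices for illustration.

```python
import numpy as np

def unit_columns(M):
    """Scale each column of M to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=0, keepdims=True)

def coherence(M):
    """Largest modulus of an inner product between distinct unit columns of M."""
    G = np.abs(M.conj().T @ M)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(3)
l, m, n, r = 6, 5, 4, 3

def random_factor(rows):
    raw = rng.standard_normal((rows, r)) + 1j * rng.standard_normal((rows, r))
    return unit_columns(raw)

U, V, W = random_factor(l), random_factor(m), random_factor(n)
lam = rng.standard_normal(r) + 1j * rng.standard_normal(r)

# T = sum_p lam_p * u_p (x) v_p (x) w_p
T = np.einsum("p,ip,jp,kp->ijk", lam, U, V, W)

lhs = np.linalg.norm(T) ** 2   # squared Frobenius norm of T
rhs = (1 - r * coherence(U) * coherence(V) * coherence(W)) * np.linalg.norm(lam) ** 2
```

The inequality `lhs >= rhs` holds for any such factors; it only becomes informative, and yields coercivity, when the product of the coherences is below $1/r$ so that the right-hand side is positive.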



5. Uniqueness 

While never formally stated, one of the main maxims in compressed sensing is that 'uniqueness implies sparsity'. For example, this is implicit in various sparsest recovery arguments in [15] [18] where, depending on context, 'sparsest' may also mean 'lowest rank'. We state a simple formulation of this observation for our purpose. Let $D$ be a dictionary of atoms in a vector space $\mathbb{V}$ (over an infinite field). We do not require $D$ to be finite or countable. In almost all cases of interest $D$ will be overcomplete with high redundancy. For $x \in \mathbb{V}$ and $r \in \mathbb{N}$, by a $D$-representation, we shall mean a representation of the form $x = \alpha_1 x_1 + \cdots + \alpha_r x_r$ where $x_1, \dots, x_r \in D$ and $\alpha_1 \cdots \alpha_r \ne 0$ ($x_1, \dots, x_r$ are not required to be distinct).

Lemma 5.1. Let $x = \alpha_1 x_1 + \cdots + \alpha_r x_r$ be a $D$-representation. (i) If this is the unique $D$-representation with $r$ terms, then $x_1, \dots, x_r$ must be linearly independent. (ii) If this is the sparsest $D$-representation, then $x_1, \dots, x_r$ must be linearly independent. (iii) If this is unique, then it must also be sparsest.

Proof. Suppose $\beta_1 x_1 + \cdots + \beta_r x_r = 0$ is a nontrivial linear relation. (i): Since not all $\beta_i$ are 0 while all $\alpha_i \ne 0$, for some $\theta \ne 0$ we must have $(\alpha_1 + \theta\beta_1) \cdots (\alpha_r + \theta\beta_r) \ne 0$, which yields a different $D$-representation $x = x + \theta 0 = (\alpha_1 + \theta\beta_1) x_1 + \cdots + (\alpha_r + \theta\beta_r) x_r$. (ii): Say $\beta_r \ne 0$, then $x = (\alpha_1 - \alpha_r \beta_r^{-1} \beta_1) x_1 + \cdots + (\alpha_{r-1} - \alpha_r \beta_r^{-1} \beta_{r-1}) x_{r-1}$ is a sparser $D$-representation. (iii): Let $x = \gamma_1 y_1 + \cdots + \gamma_s y_s$ be a $D$-representation with $s < r$. Write $y_1 = \sum_{k=1}^{r-s+1} \theta_k y_1$ with $\sum_{k=1}^{r-s+1} \theta_k = 1$ and every $\theta_k \ne 0$. Then we obtain an $r$-term $D$-representation $\sum_{k=1}^{r-s+1} \gamma_1 \theta_k y_1 + \sum_{i=2}^{s} \gamma_i y_i$ different from the given one. They are different since $y_1, y_1, \dots, y_1, y_2, \dots, y_s$ are linearly dependent, whereas (i) implies that $x_1, \dots, x_r$ are linearly independent. □

We will now discuss a combinatorial notion useful in guaranteeing uniqueness or sparsity of $D$-representations. The notion of the girth of a circuit [29] is standard and well-known in graphical matroids: it is simply the length of a shortest cycle of a graph. However the girth of a circuit in vector matroids, i.e. the cardinality of the smallest linearly dependent subset of a collection of vectors in a vector space, has rarely appeared in linear algebra. This has led to it being reinvented multiple times under different names, most notably as Kruskal rank or k-rank in tensor decompositions [24], as spark in compressed sensing [15], and as k-stability in coding theory [38]. The notions of girth, spark, k-rank, and k-stability [38] are related as follows.



Lemma 5.2. Let $\mathbb{V}$ be a vector space over a field $\mathbb{F}$ and $X = \{x_1, \dots, x_n\}$ be a finite subset of $\mathbb{V}$. Then

$$\operatorname{girth}(X) = \operatorname{spark}(X) = \operatorname{krank}(X) + 1$$

and furthermore $X$ is $k$-stable iff $\operatorname{krank}(X) = n - k$.

Proof. These follow directly from the respective definitions. □

These notions are unfortunately expected to be difficult to compute because of the following result [36].

Theorem 5.3 (Vardy). It is NP-hard to compute the girth of a vector matroid over a finite field of two elements, $\mathbb{F}_2$.

A consequence is that spark, k-rank, and k-stability are all NP-hard if the field is $\mathbb{F}_2$. We note here that several authors have assumed that spark is NP-hard to compute over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$ (assuming $\mathbb{Q}$ or $\mathbb{Q}[i]$ inputs) but this is actually unknown. In particular it does not follow from [28]. While it is clear that computing spark via a naive exhaustive search has complexity $O(2^n)$, one may perhaps do better with cleverer algorithms when $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$; in fact it is unknown in this case whether the corresponding decision problem (given finite $X \subseteq \mathbb{V}$ and $s \in \mathbb{N}$, is $\operatorname{spark}(X) = s$?) is NP-hard. On the other hand it is easy to compute coherence. Even a straightforward search for an off-diagonal entry of $X^* X$ of maximum magnitude is of polynomial complexity. An important observation of [15] is that coherence may sometimes be used in place of spark.

One of the early results in compressed sensing [15] [18] on the uniqueness of the sparsest solution is that if

$$\frac{1}{2} \operatorname{spark}(X) > \|\beta\|_0 = \operatorname{card}\{\beta_i \ne 0\}, \tag{10}$$

then $\beta \in \mathbb{C}^n$ is a unique solution to $\min\{\|\beta\|_0 \mid X\beta = x\}$.

For readers familiar with Kruskal's condition that guarantees the uniqueness of tensor decomposition, the parallel with (10) is hard to miss once we rewrite Kruskal's condition in the form

$$\frac{1}{2} \bigl[\operatorname{krank}(X) + \operatorname{krank}(Y) + \operatorname{krank}(Z)\bigr] > \operatorname{rank}(A). \tag{11}$$

We state a slight variant of Kruskal's result [24] here. Note that the scaling ambiguity is unavoidable because of the multilinearity of $\otimes$.

Theorem 5.4 (Kruskal). If $A = \sum_{p=1}^{r} x_p \otimes y_p \otimes z_p$ and $\operatorname{krank}(X) + \operatorname{krank}(Y) + \operatorname{krank}(Z) \ge 2r + 2$, then $r = \operatorname{rank}(A)$ and the decomposition is unique up to scaling of the form $\alpha x \otimes \beta y \otimes \gamma z = x \otimes y \otimes z$ for $\alpha, \beta, \gamma \in \mathbb{C}$ with $\alpha\beta\gamma = 1$. This inequality is also sharp in the sense that $2r + 2$ cannot be replaced by $2r + 1$.

Proof. The uniqueness was Kruskal's original result in [24]; alternate shorter proofs may be found in [25] [32] [34]. That $r = \operatorname{rank}(A)$ then follows from Lemma 5.1(iii). The sharpness of the inequality is due to [13]. □

Since spark is expected to be difficult to compute, one may substitute coherence to get a condition [15] [18] that is easier to check:

$$\frac{1}{2} \Bigl[ 1 + \frac{1}{\mu(X)} \Bigr] > \|\beta\|_0. \tag{12}$$

Condition (12) implies (10) because of the following result of [15] [18].

Lemma 5.5. Let $\mathbb{H}$ be a Hilbert space and $V = \{v_1, \dots, v_r\}$ be a finite collection of unit vectors in $\mathbb{H}$. Then

$$\operatorname{spark}(V) \ge 1 + \frac{1}{\mu(V)} \qquad \text{and} \qquad \operatorname{krank}(V) \ge \frac{1}{\mu(V)}.$$

Proof. Let $\operatorname{spark}(V) = s = \operatorname{krank}(V) + 1$. Assume without loss of generality that $\{v_1, \dots, v_s\}$ is a minimal circuit of $V$ and that $\alpha_1 v_1 + \cdots + \alpha_s v_s = 0$ with $|\alpha_1| = \max\{|\alpha_1|, \dots, |\alpha_s|\} > 0$. Taking inner product with $v_1$ we get $\alpha_1 = -\alpha_2 \langle v_2, v_1 \rangle - \cdots - \alpha_s \langle v_s, v_1 \rangle$ and so $|\alpha_1| \le (|\alpha_2| + \cdots + |\alpha_s|)\, \mu(V)$. Dividing by $|\alpha_1|$ then yields $1 \le (s - 1)\, \mu(V)$. □
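To make the contrast between spark and coherence concrete, the sketch below computes the spark by exhaustive search over column subsets (exponential cost, usable only for tiny examples) and compares it with the coherence-based lower bound of Lemma 5.5. The function names, the numerical rank tolerance, and the convention of returning one more than the number of columns when no dependent subset exists are assumptions made here for illustration.

```python
import itertools
import numpy as np

def coherence(X):
    """Largest modulus of an inner product between distinct normalized columns."""
    X = X / np.linalg.norm(X, axis=0, keepdims=True)
    G = np.abs(X.conj().T @ X)
    np.fill_diagonal(G, 0.0)
    return G.max()

def spark(X, tol=1e-10):
    """Size of the smallest linearly dependent subset of columns (brute force).

    Returns n + 1 when all n columns are linearly independent
    (a convention assumed for this sketch).
    """
    n = X.shape[1]
    for s in range(1, n + 1):
        for cols in itertools.combinations(range(n), s):
            if np.linalg.matrix_rank(X[:, list(cols)], tol=tol) < s:
                return s
    return n + 1

# Three unit vectors in the plane: any two are independent, all three are dependent.
X = np.array([[1.0, 0.0, 1.0 / np.sqrt(2)],
              [0.0, 1.0, 1.0 / np.sqrt(2)]])

s = spark(X)                     # exhaustive search over subsets
bound = 1 + 1 / coherence(X)     # cheap lower bound from the Gram matrix
```

For this small example the exhaustive search visits every subset of columns, while the bound needs only one Gram matrix; the bound is looser than the true spark but costs polynomial time.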

We now characterize the uniqueness of tensor decompositions in terms of coherence. Note that $\mathbb{C}$ may be replaced by $\mathbb{R}$. By "unimodulus scaling", we mean scaling of the form $e^{\imath\theta_1} u \otimes e^{\imath\theta_2} v \otimes e^{\imath\theta_3} w$ where $\theta_1 + \theta_2 + \theta_3 \equiv 0 \bmod 2\pi$.

Theorem 5.6. Let $A \in \mathbb{C}^{l \times m \times n}$ and $A = \sum_{p=1}^{r} \lambda_p\, u_p \otimes v_p \otimes w_p$, where $\lambda_p \in \mathbb{C}$, $\lambda_p \ne 0$, and $\|u_p\|_2 = \|v_p\|_2 = \|w_p\|_2 = 1$ for all $p = 1, \dots, r$. We write $U = \{u_1, \dots, u_r\}$, $V = \{v_1, \dots, v_r\}$, $W = \{w_1, \dots, w_r\}$. If

$$\frac{1}{2} \Bigl[ \frac{1}{\mu(U)} + \frac{1}{\mu(V)} + \frac{1}{\mu(W)} \Bigr] > r, \tag{13}$$

then $r = \operatorname{rank}(A)$ and the rank revealing decomposition is unique up to unimodulus scaling.

Proof. If (13) is satisfied, then Kruskal's condition for uniqueness (11) must also be satisfied by Lemma 5.5. □

Note that unlike the k-ranks in (11), the coherences in (13) are trivial to compute. In addition to uniqueness, an easy but important consequence of Theorem 5.6 is that it provides a readily checkable sufficient condition for tensor rank, which is NP-hard over any field [20] [22].
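A coherence-based uniqueness test takes only a few lines of code. The sketch below is one conservative way to do it, not the paper's own procedure: it combines the bound $\operatorname{krank} \ge 1/\mu$ of Lemma 5.5 with the inequality $\operatorname{krank}(X) + \operatorname{krank}(Y) + \operatorname{krank}(Z) \ge 2r + 2$ of Theorem 5.4, and therefore tests whether $1/\mu(U) + 1/\mu(V) + 1/\mu(W) \ge 2r + 2$. Function names are hypothetical, and a coherence of zero is treated as an infinite reciprocal.

```python
import numpy as np

def coherence(M):
    """Largest modulus of an inner product between distinct normalized columns."""
    M = M / np.linalg.norm(M, axis=0, keepdims=True)
    G = np.abs(M.conj().T @ M)
    np.fill_diagonal(G, 0.0)
    return G.max()

def kruskal_via_coherence(U, V, W):
    """Sufficient test: sum of reciprocal coherences is at least 2r + 2.

    By Lemma 5.5 each reciprocal coherence is a lower bound on the k-rank,
    so a True result implies the k-rank inequality of Theorem 5.4.
    """
    r = U.shape[1]
    total = 0.0
    for M in (U, V, W):
        mu = coherence(M)
        total += np.inf if mu == 0 else 1.0 / mu
    return total >= 2 * r + 2

orthonormal = np.eye(3)                            # r = 3, coherence 0
nearly_collinear = np.array([[1.0, 1.0],
                             [0.0, 0.1]])          # r = 2, coherence close to 1

well_separated = kruskal_via_coherence(orthonormal, orthonormal, orthonormal)
poorly_separated = kruskal_via_coherence(nearly_collinear, nearly_collinear, nearly_collinear)
```

Because the test is only sufficient, a `False` result does not mean the decomposition is non-unique; it only means that coherence alone cannot certify uniqueness for those factors.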

6. Conclusion 



The following existence and uniqueness result may be deduced from Theorems 4.3 and 5.6.

Corollary 6.1. Let $A \in \mathbb{C}^{l \times m \times n}$. If $\mu_1, \mu_2, \mu_3 \in (0, \infty)$ satisfy

$$\frac{1}{\sqrt[3]{\mu_1 \mu_2 \mu_3}} > \frac{2}{3}\, r,$$

then the bounded coherence rank-$r$ approximation problem (8) has a solution that is unique up to unimodulus scaling.

Proof. The case $r = 1$ is trivial. For $r \ge 2$, since $\mu_1 \mu_2 \mu_3 < (3/(2r))^3 < 1/r$, Theorem 4.3 guarantees that a solution to (8) exists. Let $A_r = \lambda_1 u_1 \otimes v_1 \otimes w_1 + \cdots + \lambda_r u_r \otimes v_r \otimes w_r$ be a solution and let $U = \{u_1, \dots, u_r\}$, $V = \{v_1, \dots, v_r\}$, $W = \{w_1, \dots, w_r\}$. Since $\mu(U) \le \mu_1$, $\mu(V) \le \mu_2$, $\mu(W) \le \mu_3$, the harmonic mean-geometric mean inequality yields

$$3 \Bigl[ \frac{1}{\mu(U)} + \frac{1}{\mu(V)} + \frac{1}{\mu(W)} \Bigr]^{-1} \le \sqrt[3]{\mu(U)\mu(V)\mu(W)} \le \sqrt[3]{\mu_1 \mu_2 \mu_3} < \frac{3}{2r},$$

so (13) holds and the decomposition of $A_r$ is unique by Theorem 5.6. □



In the context of our application in Section 2.1, this corollary means that radiating sources can be uniquely localized if they are either (i) sufficiently separated in space (angular separation viewed by a subarray, or by the array defined by translations between subarrays), or (ii) in time (small sample cross correlations), noting that the scalar product between two time series is simply the sample cross correlation. Contrary to more classical approaches based on second or higher order moments, it is not necessary that both conditions hold here; Corollary 6.1 requires only that the product between coherences be small. In addition, there is no need for long data samples since the approach is deterministic; this is highly unusual in antenna array processing. Cross correlations need to be sufficiently small among sources only for identifiability purposes but they are not explicitly computed in the identification process. Hence our model is robust with respect to short record durations. Observe also that the number of time samples can be as small as the number of sources. Lastly, an estimate of source time samples may be obtained from the tensor decomposition as a key byproduct of this deterministic approach.